Abstract: We investigate the maximum coding rate achievable over a two-user broadcast channel for the scenario where a common message is transmitted using variable-length stop-feedback codes. Specifically, upon decoding the common message, each decoder sends a stop signal to the encoder, which transmits continuously until it receives both stop signals. For the point-to-point case, Polyanskiy, Poor, and Verdú (2011) recently demonstrated that variable-length coding combined with stop feedback significantly increases the speed at which the maximum coding rate converges to capacity. This speed-up manifests itself in the absence of a square-root penalty in the asymptotic expansion of the maximum coding rate for large blocklengths, a result also known as zero dispersion. In this paper, we show that this speed-up does not necessarily occur for the broadcast channel with common message. Specifically, there exist scenarios for which variable-length stop-feedback codes yield a positive dispersion.

I. Introduction 

We consider the setup where an encoder wishes to convey a common message over a broadcast channel with noiseless feedback to two decoders. Similarly to the single-decoder (SD) case, noiseless feedback combined with fixed-blocklength codes does not improve capacity, which is given by [1, p. 126]

$$C = \sup_{P} \min\{I(P, W_1), I(P, W_2)\}. \tag{1}$$

Here, $W_1$ and $W_2$ denote the channels to decoder 1 and 2, respectively, and the supremum is over all input distributions $P$.
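As a rough numerical illustration of (1), the sketch below evaluates $\min\{I(P, W_1), I(P, W_2)\}$ on a grid of binary input distributions. The two binary symmetric channels, the crossover probabilities 0.1 and 0.2, the grid size, and all function names are assumptions made for this example only; they do not come from the paper. Rates are in nats.

```python
import math

def mutual_information(p, W):
    """I(P, W) in nats for input distribution p and channel matrix W[x][y]."""
    ny = len(W[0])
    # Output marginal PW(y) = sum_x P(x) W(y|x)
    q = [sum(p[x] * W[x][y] for x in range(len(p))) for y in range(ny)]
    total = 0.0
    for x, px in enumerate(p):
        for y in range(ny):
            if px > 0 and W[x][y] > 0:
                total += px * W[x][y] * math.log(W[x][y] / q[y])
    return total

def common_message_capacity(W1, W2, grid=1000):
    """Grid search of sup_P min{I(P, W1), I(P, W2)} over binary inputs."""
    best_c, best_p = -1.0, None
    for i in range(grid + 1):
        p = [i / grid, 1 - i / grid]
        c = min(mutual_information(p, W1), mutual_information(p, W2))
        if c > best_c:
            best_c, best_p = c, p
    return best_c, best_p

def bsc(delta):
    """Transition matrix of a binary symmetric channel (illustrative choice)."""
    return [[1 - delta, delta], [delta, 1 - delta]]

C, P_star = common_message_capacity(bsc(0.1), bsc(0.2))
print(C, P_star)
```

For this symmetric pair the maximizing input distribution is uniform for both channels, so the common-message capacity coincides with the capacity of the noisier channel.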
For the case when there is no feedback, the speed at which C is 
approached as the blocklength $n$ increases is of the order $1/\sqrt{n}$
Q (same as in the SD case). The constant factor associated to 
the $1/\sqrt{n}$ term is commonly referred to as channel dispersion.

For the SD case, noiseless feedback combined with variable-length codes significantly improves the speed of convergence to capacity. Specifically, it was shown in [3] that

$$\log M_{\mathrm{f}}^*(\ell, \epsilon) = \frac{\ell C}{1 - \epsilon} + O(\log \ell) \tag{2}$$

where $\ell$ stands for the average blocklength (average transmission time), $M_{\mathrm{f}}^*(\ell, \epsilon)$ is the maximum number of codewords in the SD case, and $C$ denotes the corresponding capacity. One sees from (2) that no square-root penalty occurs (zero dispersion), which implies a fast convergence to the asymptotic limit. This fast convergence is demonstrated numerically in [3] by means of nonasymptotic bounds. Variable-length stop-feedback (VLSF) codes, i.e., coding schemes where the feedback is used only to stop transmissions, are sufficient to achieve (2).

The purpose of this paper is to investigate whether a similar 
result holds for the broadcast channel with common message. 


Contribution: We consider the subclass of discrete memoryless broadcast channels for which $I(P, W_1)$ and $I(P, W_2)$ are maximized by the same input distribution $P^*$, which we assume to be unique. In this case, $C = \min\{I(P^*, W_1), I(P^*, W_2)\}$. Focusing on the case when VLSF codes are used, we obtain nonasymptotic achievability and converse bounds on the maximum number of codewords $M_{\mathrm{sf}}^*(\ell, \epsilon)$ with average blocklength $\ell$ that can be transmitted with reliability $1 - \epsilon$. Here, the subscript “sf” stands for stop feedback. By analyzing these bounds in the large-$\ell$ regime, we prove that when the two subchannels are independent and have the same capacity and the same dispersion, and when $\epsilon < 0.1968$, the asymptotic expansion of $\log M_{\mathrm{sf}}^*(\ell, \epsilon)$ contains a square-root penalty (see (18) and (22) for a precise statement of this result). Hence, the fast convergence to the asymptotic limit experienced in the SD case cannot be expected.

The intuition behind this result is as follows: in the SD case, the stochastic variations of the information density that result in the square-root penalty can be virtually eliminated by using variable-length coding with stop feedback. Indeed, decoding is stopped after the information density exceeds a certain threshold, which yields only negligible stochastic variations. In the broadcast setup, however, the stochastic variations in the difference between the stopping times at the two decoders make the square-root penalty reappear. Note that our result does not necessarily imply that feedback is useless. It only shows that VLSF codes cannot be used to speed up convergence to the same level as in the SD case.
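The following Monte Carlo sketch illustrates this intuition under simplifying assumptions that are not taken from the paper: both decoders observe independent binary symmetric channels with crossover probability 0.11, inputs are uniform, and each decoder stops when its information density first reaches a common threshold `gamma` (in nats). The quantity `predicted_gap` is a Gaussian heuristic for the extra waiting time caused by taking the maximum of two independent stopping times; it is included only for comparison.

```python
import math
import random

def stopping_time(gamma, delta, rng):
    """First n at which the information density of a BSC(delta) with
    uniform inputs reaches gamma (nats)."""
    up = math.log(2 * (1 - delta))   # increment when the bit is not flipped
    down = math.log(2 * delta)       # increment when the bit is flipped
    s, n = 0.0, 0
    while s < gamma:
        s += down if rng.random() < delta else up
        n += 1
    return n

def simulate(gamma=100.0, delta=0.11, trials=1000, seed=1):
    """Average stopping time of one decoder and of the slower of two."""
    rng = random.Random(seed)
    single = both = 0
    for _ in range(trials):
        t1 = stopping_time(gamma, delta, rng)
        t2 = stopping_time(gamma, delta, rng)  # independent second decoder
        single += t1
        both += max(t1, t2)
    return single / trials, both / trials

delta, gamma = 0.11, 100.0
C = math.log(2) + delta * math.log(delta) + (1 - delta) * math.log(1 - delta)
V = delta * (1 - delta) * math.log((1 - delta) / delta) ** 2
mean_single, mean_max = simulate(gamma, delta)
predicted_gap = math.sqrt(V * gamma / (math.pi * C ** 3))
print(mean_single, gamma / C)                  # single decoder: close to gamma / C
print(mean_max - mean_single, predicted_gap)   # extra delay of order sqrt(gamma)
```

A single decoder stops after roughly `gamma / C` channel uses, whereas waiting for both decoders adds a term that grows like the square root of `gamma`, which is the mechanism behind the square-root penalty discussed above.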

Proof techniques: The achievability bound is an extension of [3, Th. 3]; the converse bound is based on an optimal stopping problem, where the probability that the stopping time exceeds a given threshold is minimized under a constraint on the “stopped” information density process. The asymptotic analysis of the converse bound relies on Hoeffding's inequality and on the Berry-Esseen central limit theorem, whereas the asymptotic analysis of the achievability bound relies on asymptotic results for random walks [4] and on a Berry-Esseen-type theorem that holds for random summations [5].

Notation: Upper case, lower case, and calligraphic letters denote random variables (RVs), deterministic quantities, and sets, respectively. The probability density function of a standard Gaussian RV is denoted by $\phi(x)$. Furthermore, $\Phi(x) = 1 - Q(x)$ is its cumulative distribution function, where $Q(x)$ is the Q-function. We let $x^+$ and $x^-$ denote $\max\{0, x\}$ and $\min\{0, x\}$, respectively. Throughout the paper, the index $k$ always belongs to the set $\{1, 2\}$, although this is sometimes omitted. Furthermore, $\bar{k} = 3 - k$. We adopt the convention that $\sum_{i=j+1}^{j} a_i = 0$ for all $\{a_i\}$ and all integers $j$. We use “$c$” to denote a finite nonnegative constant. Its value may change at each occurrence. Finally, $\mathbb{N}$ denotes the set of positive integers and $\mathbb{Z}_+ = \mathbb{N} \cup \{0\}$.


II. System Model 


A common-message discrete memoryless broadcast channel with two decoders is defined by the finite input alphabet $\mathcal{X}$ and the finite output alphabets $\mathcal{Y}_k$, along with the stochastic matrices $W_k$, where $W_k(y_k|x)$ denotes the probability that $y_k \in \mathcal{Y}_k$ is observed at decoder $k$ given $x \in \mathcal{X}$. We assume that the outputs at each time $i$ are conditionally independent given the input, i.e.,

$$P_{Y_{1,i},Y_{2,i}|X_i}(y_{1,i}, y_{2,i}|x_i) = W_1(y_{1,i}|x_i)\,W_2(y_{2,i}|x_i). \tag{3}$$

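Define the set of probability distributions on $\mathcal{X}$ by $\mathcal{P}(\mathcal{X})$. Let $P \times W_k : (x, y_k) \mapsto P(x)W_k(y_k|x)$ denote the joint distribution of input and output at decoder $k$, and let $PW_k : y_k \mapsto \sum_{x \in \mathcal{X}} P(x)W_k(y_k|x)$ denote the marginal distribution on $\mathcal{Y}_k$. For every $P \in \mathcal{P}(\mathcal{X})$, the information density is defined as

$$i_{P,W_k}(x^n; y_k^n) = \sum_{i=1}^{n} \log \frac{W_k(y_{k,i}|x_i)}{PW_k(y_{k,i})}. \tag{4}$$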


We let $I(P, W_k) = \mathbb{E}_{P \times W_k}[i_{P,W_k}(X; Y_k)]$ be the mutual information, $V(P, W_k) = \mathrm{Var}_{P \times W_k}[i_{P,W_k}(X; Y_k)]$ be the (unconditional) information variance, and $T(P, W_k) = \mathbb{E}_{P \times W_k}[|i_{P,W_k}(X; Y_k) - I(P, W_k)|^3]$ be the third absolute moment of the information density. We restrict ourselves to the case where there exists a unique probability distribution $P^* \in \mathcal{P}(\mathcal{X})$ that simultaneously maximizes both $I(P, W_1)$ and $I(P, W_2)$. In this case, the capacity is given by

$$C = \min\{C_1, C_2\} \tag{5}$$

where $C_k = I(P^*, W_k)$. The corresponding (unique) capacity-achieving output distributions are denoted by $P^*_{Y_k}$. Finally, we also define the dispersions $V_k = V(P^*, W_k)$.
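To make these definitions concrete, the sketch below computes the mean, variance, and third absolute central moment of the single-letter information density for a generic channel matrix. The binary symmetric channel with crossover probability 0.11 and the function name `information_moments` are assumptions chosen for illustration. Quantities are in nats.

```python
import math

def information_moments(p, W):
    """Return (I, V, T): mean, variance and third absolute central moment
    of log(W(y|x) / PW(y)) under the joint distribution P x W, in nats."""
    ny = len(W[0])
    # Output marginal PW(y) = sum_x P(x) W(y|x)
    q = [sum(p[x] * W[x][y] for x in range(len(p))) for y in range(ny)]
    # (probability, information density) pairs over the joint support
    atoms = [(p[x] * W[x][y], math.log(W[x][y] / q[y]))
             for x in range(len(p)) for y in range(ny)
             if p[x] * W[x][y] > 0]
    I = sum(pr * i for pr, i in atoms)
    V = sum(pr * (i - I) ** 2 for pr, i in atoms)
    T = sum(pr * abs(i - I) ** 3 for pr, i in atoms)
    return I, V, T

delta = 0.11
W = [[1 - delta, delta], [delta, 1 - delta]]
I, V, T = information_moments([0.5, 0.5], W)
print(I, V, T)
```

For a binary symmetric channel with uniform inputs, the first two outputs can be checked against the closed forms $\log 2 - h(\delta)$ and $\delta(1-\delta)\log^2\frac{1-\delta}{\delta}$, where $h$ is the binary entropy function in nats.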

We are now ready to formally define a VLSF code for the 
broadcast channel with common message. 

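Definition 1: An $(\ell, M, \epsilon)$-VLSF code for the broadcast channel with common message consists of:

1) A RV $U \in \mathcal{U}$, with $|\mathcal{U}| \le 3$, which is known by the encoder and by both decoders.

2) A sequence of encoders $f_n : \mathcal{U} \times \mathcal{M} \to \mathcal{X}$, each one mapping the message $J \in \mathcal{M} = \{1, \ldots, M\}$, drawn uniformly at random, to the channel input according to $X_n = f_n(U, J)$.

3) Two nonnegative integer-valued RVs $\tau_1$ and $\tau_2$ that are stopping times with respect to the filtrations $\mathcal{F}(U, Y_1^n)$ and $\mathcal{F}(U, Y_2^n)$, respectively, and which satisfy

$$\mathbb{E}[\max\{\tau_1, \tau_2\}] \le \ell. \tag{6}$$

4) A sequence of decoders $g_{k,n} : \mathcal{U} \times \mathcal{Y}_k^n \to \mathcal{M}$ satisfying

$$\Pr[J \ne g_{k,\tau_k}(U, Y_k^{\tau_k})] \le \epsilon, \quad k \in \{1, 2\}. \tag{7}$$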


Remark 1: The RV $U$ serves as common randomness, and enables the use of randomized codes. To establish the cardinality bound on $\mathcal{U}$, we proceed as in [3, Th. 19] to show that $|\mathcal{U}| \le 4$ is sufficient. This bound can be further improved to $|\mathcal{U}| \le 3$ by using the Fenchel-Eggleston theorem [2, p. 35].

Remark 2: VLSF codes require a feedback link from the decoders to the encoder. This feedback consists of a 1-bit stop signal per decoder, which is sent by decoder $k$ at time $\tau_k$. The encoder continuously transmits until both decoders have fed back a stop signal. Hence, the blocklength is $\max\{\tau_1, \tau_2\}$.

Our aim is to characterize the largest number of codewords $M_{\mathrm{sf}}^*(\ell, \epsilon)$, whose average length is $\ell$, that can be transmitted with reliability $1 - \epsilon$ using a VLSF code.

III. Main Results 
A. Achievability bound

We first present an achievability bound. Its proof (omitted) follows closely the proof of [3, Th. 3].

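Theorem 1: Fix $P \in \mathcal{P}(\mathcal{X})$. Let $\gamma_1, \gamma_2 \ge 0$ and $0 \le q \le 1$ be arbitrary scalars. Let the stopping times $\tau_k$ and $\bar{\tau}_k$, $k \in \{1, 2\}$, be defined as

$$\tau_k = \inf\{n \ge 0 : i_{P,W_k}(X^n; Y_k^n) \ge \gamma_k\} \tag{8}$$

$$\bar{\tau}_k = \inf\{n \ge 0 : i_{P,W_k}(\bar{X}^n; Y_k^n) \ge \gamma_k\} \tag{9}$$

where $(X^n, \bar{X}^n, Y_1^n, Y_2^n)$ are jointly distributed according to

$$P_{X^n, \bar{X}^n, Y_1^n, Y_2^n}(x^n, \bar{x}^n, y_1^n, y_2^n) = P_{Y_1^n, Y_2^n | X^n}(y_1^n, y_2^n | x^n) \prod_{i=1}^{n} P(x_i) P(\bar{x}_i). \tag{10}$$

For every $M$, there exists an $(\ell, M, \epsilon)$-VLSF code such that

$$\ell \le (1 - q)\,\mathbb{E}[\max\{\tau_1, \tau_2\}] \tag{11}$$

and

$$\epsilon \le q + (1 - q)(M - 1)\Pr[\tau_k \ge \bar{\tau}_k]. \tag{12}$$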

Remark 3: Following the same steps as in [3, Eq. (111)-(118)], $\epsilon$ in (12) can be further upper-bounded as

$$\epsilon \le q + (1 - q)(M - 1)\exp\{-\gamma_k\}. \tag{13}$$

This bound is easier to evaluate and to analyze asymptotically.
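As a small illustration of how the bound in Remark 3 might be evaluated, the sketch below assumes it has the form $q + (1-q)(M-1)\exp\{-\gamma\}$ for a single threshold $\gamma$ in nats. The helper names and the numerical values of `M`, `q`, and `slack` are hypothetical and chosen only for this example.

```python
import math

def error_bound(q, M, gamma):
    """Evaluate q + (1 - q) (M - 1) exp(-gamma), the assumed form of (13)."""
    return q + (1 - q) * (M - 1) * math.exp(-gamma)

def threshold_for(M, slack):
    """Smallest gamma (nats) for which (M - 1) exp(-gamma) <= slack."""
    return math.log((M - 1) / slack)

M, q, slack = 2 ** 20, 0.01, 1e-3
gamma = threshold_for(M, slack)
print(gamma, error_bound(q, M, gamma))
```

Choosing the threshold slightly above $\log M$ in this way makes the second term as small as desired, at the cost of a longer expected stopping time.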
B. Converse bound

Let Px" G P(A’) be the type 0 Def. 2.1] of the sequence 
x" G X"- We are now ready to state our converse bound. 

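Theorem 2: For every $M$, $t \in \mathbb{Z}_+$ and $\delta > 0$, let

$$\lambda_t = \log M - \log\log M - \delta - (|\mathcal{X}| - 1)\log(t + 1) \tag{14}$$

and let

$$L_t = \prod_{k=1}^{2} \Big\{ \max_{x^t \in \mathcal{X}^t} \Pr[i_{P_{x^t},W_k}(x^t; Y_k^t) \ge \lambda_t] \Big\} + \epsilon_M \Big( 1 + \min_k \max_{x^t \in \mathcal{X}^t} \Pr[i_{P_{x^t},W_k}(x^t; Y_k^t) \ge \lambda_t] \Big) \tag{15}$$

where $\epsilon_M = \epsilon + (\log M)^{-1}$. Then, for every $(\ell, M, \epsilon)$-VLSF code, we have

$$\ell \ge \sum_{t=0}^{\infty} (1 - L_t)^+. \tag{16}$$

Proof: See Section IV. ■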


C. Asymptotic expansion

Analyzing the bounds of Theorems 1 and 2 in the limit $\ell \to \infty$, we obtain the following asymptotic characterization of $M_{\mathrm{sf}}^*(\ell, \epsilon)$.

Theorem 3: Let Zk ~ A/'(0,1), V = S/V 1 V 2 , Qk = 
and let y = Q~^{x) be the solution of 

n Qi~8ky) + x(l + mmQ{-gky)] = 1. (17) 

k=l ^ ^ 

For every discrete memoryless broadcast channel with C\ = C 2 
and every e € ( 0 , 1 ), we have 


Cl 

1 - e 


< logM;(),e) 


< 


Cl 


- Sc\/I +C>(logZ) (18) 


where ^ > 0 is an arbitrarily small constant. 


' Vl + ^2 
27r(l — e) 


(19) 


and 


V 


(l-e)3 

—e ( 2Q~^(e) — minE 


E min|(3 ^(e) ,maxpfcZfe|| 
in|Q“^(e), 


( 20 ) 


Proof: The converse bound in (18) is proved in Section V and the achievability bound is proved in Section VI. ■

Remark 4: When $C_1 \ne C_2$, it can be shown that the square-root penalty on the LHS of (18) vanishes. In this case, the problem reduces to the point-to-point transmission to the weakest decoder, for which the zero-dispersion result in [3] applies.

Remark 5: For the case when $P_{Y_{1,i},Y_{2,i}|X_i}$ does not satisfy (3), a bound similar to the LHS of (18) can be obtained by replacing $S_a$ in (19) with

$$\sqrt{\frac{V_1 + V_2 - 2\,\mathrm{Cov}\big(i_{P^*,W_1}(X; Y_1),\, i_{P^*,W_2}(X; Y_2)\big)}{2\pi(1 - \epsilon)}}. \tag{21}$$


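Remark 6: When $\varrho_1 = \varrho_2 = 1$ (and, hence, $V_1 = V_2$), one can simplify the RHS of (18) as follows:

$$\log M_{\mathrm{sf}}^*(\ell, \epsilon) \le \frac{C\ell}{1 - \epsilon} - \sqrt{\frac{V_1 \ell}{(1 - \epsilon)^3}} \left( \frac{1}{\sqrt{\pi}}\Big(1 - Q\big(\sqrt{2}\,Q^{-1}(\epsilon)\big)\Big) + (\epsilon - 2)\,\phi\big(Q^{-1}(\epsilon)\big) \right) + O(\log \ell). \tag{22}$$

The second-order term in (22) is strictly negative for all $\epsilon < 0.1968$. This implies that, when $C_1 = C_2$, $V_1 = V_2$, and $\epsilon < 0.1968$, the asymptotic expansion of $\log M_{\mathrm{sf}}^*(\ell, \epsilon)$ contains a square-root penalty.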
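The threshold quoted in Remark 6 can be checked numerically. The sketch below assumes that the factor multiplying the square-root term in (22) reads $\frac{1}{\sqrt{\pi}}\big(1 - Q(\sqrt{2}\,Q^{-1}(\epsilon))\big) + (\epsilon - 2)\,\phi(Q^{-1}(\epsilon))$; this reading is obtained by evaluating the Gaussian expectations for $\varrho_1 = \varrho_2 = 1$ and should be treated as an assumption. The function names and the bisection interval are choices made for this example.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard Gaussian

def q_func(x):
    """Gaussian Q-function."""
    return 1.0 - STD.cdf(x)

def q_inv(eps):
    """Inverse of the Q-function."""
    return STD.inv_cdf(1.0 - eps)

def penalty_factor(eps):
    """Assumed bracketed factor multiplying the square-root term in (22)."""
    a = q_inv(eps)
    first = (1.0 - q_func(math.sqrt(2.0) * a)) / math.sqrt(math.pi)
    return first + (eps - 2.0) * STD.pdf(a)

def find_threshold(lo=0.01, hi=0.5, iters=60):
    """Bisection for the sign change of penalty_factor on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if penalty_factor(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(find_threshold())
```

Under this reading, the factor is positive for small $\epsilon$ and changes sign close to the value 0.1968 stated in the remark, so the square-root term is a genuine penalty below that error probability.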

IV. Proof of Theorem 2

Fix $M$ and $\epsilon$. To establish Theorem 2, we derive a lower bound on $\ell$ that holds for all VLSF codes having $M$ codewords and probability of error no larger than $\epsilon$. Since

$$\ell \ge \mathbb{E}[\max\{\tau_1, \tau_2\}] = \sum_{t=0}^{\infty} \big(1 - \Pr[\max\{\tau_1, \tau_2\} \le t]\big) \tag{23}$$

we can lower-bound $\ell$ by upper-bounding $\Pr[\max\{\tau_1, \tau_2\} \le t]$ for every $t \in \mathbb{Z}_+$. The following property (proven in Appendix I-A) turns out to be useful.
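The tail-sum identity behind (23), namely that the expectation of a nonnegative integer-valued RV equals $\sum_{t \ge 0} (1 - \Pr[T \le t])$, can be verified exactly on a toy example. The two distributions below and the independence of the two stopping times are arbitrary choices for illustration; the identity itself does not require independence.

```python
from fractions import Fraction

def expectation(pmf):
    """Mean of a pmf given as {value: probability}."""
    return sum(t * p for t, p in pmf.items())

def tail_sum(pmf):
    """Sum over t >= 0 of (1 - Pr[T <= t]) for a pmf on nonnegative integers."""
    total, cdf = Fraction(0), Fraction(0)
    for t in range(max(pmf) + 1):
        cdf += pmf.get(t, Fraction(0))
        total += 1 - cdf
    return total

def pmf_of_max(pmf1, pmf2):
    """pmf of max{T1, T2} for independent T1 and T2."""
    out = {}
    for t1, p1 in pmf1.items():
        for t2, p2 in pmf2.items():
            t = max(t1, t2)
            out[t] = out.get(t, Fraction(0)) + p1 * p2
    return out

pmf1 = {1: Fraction(1, 2), 3: Fraction(1, 2)}
pmf2 = {2: Fraction(1, 3), 4: Fraction(2, 3)}
pmf_max = pmf_of_max(pmf1, pmf2)
print(expectation(pmf_max), tail_sum(pmf_max))
```

Both quantities agree exactly, which is why any upper bound on $\Pr[\max\{\tau_1, \tau_2\} \le t]$ translates term by term into a lower bound on the average blocklength.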

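Property 1: Fix $t \in \mathbb{Z}_+$ and $a \in [0, 1]$, and suppose there exists an $(\ell, M, \epsilon)$-VLSF code with $\Pr[\max\{\tau_1, \tau_2\} \le t] \le a$. Then there exists an $(\ell', M, \epsilon)$-VLSF code for some $\ell' \ge \ell$, for which $\Pr[\max\{\tau_1, \tau_2\} \le t] \le a$ and $\tau_1, \tau_2 \in \{t, t + 1, \ldots\}$.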

Fix an arbitrary (I, M, e)-VLSF code, defined by the tuple 
{fn, gi,n,g 2 ,n, Ti,T 2 , U). By Property [T] it is sufficient to con¬ 
sider codes for which ri, T 2 G {f, f -f 1, • • • }. Let u GU, 
be constants in [ 0 , 1 ] such that 'Yhu&A < e and 

Pr[J ^ gk,rMY,^^)\U = u] < eC\ 

Since {xk = n} G F{U,Y^), we can define a se¬ 
quence of binary functions (pk — {‘Pk,t, ‘Pk,t+i, • • • } such that 
Pk,n{u,y'^) = 1 {xk = n}. Let P-^'^ be the conditional prob¬ 
ability measure on X°° induced by the encoder given U = u. 
Define for u G W the set 3 )^“^ = {j/” G : (pk,n{u,y'^) = 1}. 
Note that we must have G yC\ Let the length of a 
sequence of channel outputs y G y^'^ be denoted by \y\. On 
yC '^, define the conditional probability measure IP^|’x > given 
X G and rt G W, as 

\y\ 

F‘^’;^{y\K)^Y[Wiy.\x,) (24) 


and the probability measure P^’^^(y,x) = 

P^l’“^(y|x)p4“^(x) on yC) x We also need the 

following auxiliary probability measure on y^^ 


E 

P^tEVt(X) 



(25) 


and the probability measure = Ci^y^ {y) P^\'x) 

on yC) X X°°. Here, Vt{X) C 'P{X) denotes the set of types 
formed by length-f sequences. 

Using the meta-converse theorem 13 Th. 27], the inequality 
li9] Eq. (102)], the fact that ^ i® ^ convex combination of 
distributions lITOl Lem. 3], and the upper bound |7^i(A’)| < (f-l- 
l)l'^|-i ifiYI Lem. 1.1], we conclude that (see Apnendix ll-Bll 


-[p(Jc,u) 

■^y,x 


< At 


— ^kM 


(26) 



















where ^ 

Where 


(log M) ^ and At is defined in (fl4ll . Here, 




\v\ 

E 

i=t-\-l 


log 


Wkiyi\y^i) 


(27) 


where Zfe(x‘;y*) = j/‘). Next, we minimize 

Pr[Tfc < t\U = m] over all stopping times satisfying i 

Pr[Tk<t\U = u] = ¥p^[\Y\=t] 


= P 


{k,u) 


y,x 


4“^(X;n) > At,|F| =f 


P 


{k,u) 

y,x 


7i“)(X;rfe) < At,|y| =f 


(28) 


< min 


{>. 


P 


{k,u) 

y,x 


4 “^(X;rfe)>At,|F|=f 


+ 4 


i} (29) 


< max Pr[ife(x* ;!)()> At] 


mini 4 m - 1 - max Pr[ife(x*; Yfc*) > At] 

I ’ 


(30) 


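limit theorem is applied, and the case $t \ge \beta$, where the trivial upper bound $\max_{x^t \in \mathcal{X}^t} \Pr[i_k(x^t; Y_k^t) \ge \lambda] \le 1$ suffices.

In the first case, invoking Hoeffding's inequality [12, Th. 2] and using that $I(P_{x^t}, W_k)$ is upper-bounded by $C$ uniformly, we obtain (see Appendix II-A for details)

$$\sum_{t=0}^{\lfloor \alpha \rfloor} \max_{x^t \in \mathcal{X}^t} \Pr[i_k(x^t; Y_k^t) \ge \lambda] = o(1), \quad \lambda \to \infty \tag{36}$$

and

$$\sum_{t=0}^{\lfloor \alpha \rfloor} \prod_{k=1}^{2} \Big\{ \max_{x^t \in \mathcal{X}^t} \Pr[i_k(x^t; Y_k^t) \ge \lambda] \Big\} = o(1), \quad \lambda \to \infty. \tag{37}$$

In the central regime, we use the Berry-Esseen central limit theorem [13, Th. V.3] to show that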

Here, (29) follows from (26). Since the stopping times $\tau_1$ and $\tau_2$ are conditionally independent given $U = u$, (30) implies that

2 

Pr[max{ri,r2} < t\U = u] = Y[ ^F.x [l^^l = 

k^l 

2 

- n > ^t]} 

A;* 

+ mm I ef4 + ^44 ^ 4) > | • (32) 

Note that (l32l l holds for all that satisfies (l26l l. Averaging (l32T l 
over u G U and using the inequality Pu{u)e[% < e + 

(log M)~^ = £m, we obtain (flSl l. The proof is concluded using 
(|23]l. 


Pr[*fe(x‘;F^‘) > A] <Q 


V / 


€ 

Vt 


(38) 


We next maximize (38) over $x^t \in \mathcal{X}^t$ following the approach in [10, Prop. 8]. Specifically, we use continuity properties of $I(P, W_k)$ and $V(P, W_k)$ for probability distributions $P \in \mathcal{P}(\mathcal{X})$ close to $P^*$ to show that (see Appendix II-B)


1/31 

E max Pr[ifc(x‘;yfe‘) > A] 

t=[a\+l 


< 


ivx 


Q ye)-Emin|Q ^e), ^ -h C>(log A) 

(39) 


V. Asymptotic Analysis: Converse Bound

We analyze $L_t$ in (15) in the limit $\ell \to \infty$. By (16),


where $\varrho_k$ are defined in Theorem 3 and $Z_k \sim \mathcal{N}(0, 1)$. Similarly, we obtain


oo L/3J L/31 

^>E(1-^*)^^E(1-^‘)^^E(1-^*) (33) 

i—0 t—0 t—0 

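where $\beta > 0$ will be specified shortly. Let $\lambda = \log M - \log\log M - \delta - (|\mathcal{X}| - 1)\log(\beta + 1)$. For all $t \le \beta$,

$$\max_{x^t \in \mathcal{X}^t} \Pr[i_k(x^t; Y_k^t) \ge \lambda_t] \le \max_{x^t \in \mathcal{X}^t} \Pr[i_k(x^t; Y_k^t) \ge \lambda]. \tag{34}$$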


L/3J 2 

E n P 4 *fe(^*; pfe) > A] 

t=[aj+l k=l 




C3 

0{\ogX). 


min|Q ,maxgkZk 


(40) 


The key step is to establish an asymptotic upper bound on 
maxxtga't Pr[*fe(x‘; 1)() > A] for every t G as A —^ oo. 

and let /3 be the solution of 


Leta^ ^ 


{X-f3C)/^=-Q-\e) 


(35) 


where C is given in ©, V is defined in Theorem |2 
and Q~^{e) in dnii. We divide the asymptotic analysis of 
maxxtga't Pr[*fe(x‘; ) > A] into three cases: the “large devi¬ 
ations regime” t G [0, a), where we use Hoeffding’s inequality, 
the “central regime” t G [a, /3), where Berry-Esseen central 


Using d^ . d^ . (ITTT i. d^ . and d40l i. we obtain 


1/31 

t^o 

A(1 — Em) 


(41) 


> 


C 


C3 ' 


—£m (^ 2Q ^ (e) — mm E 

-O(logA) 


Jq ye),maxpfcZfe 


i|q ^{e),gkZk'^ 


( 42 ) 


























as A —oo. Finally, we have that 


A = log M — log log M — 5 — {\X\ — 1) log(/3 + 1) 
Cl 

< - 

1 — Em 



E 


min 


Q ^(e) 

k 


(43) 


Em ( 2 Q ^(e) — minE 
k 


lin {Q ^(e),efc^fe| 


+ 0{\ogl) (44) 

as $\ell \to \infty$. The final result in (18) is obtained through algebraic manipulations.


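upper-bound $\mathbb{E}[\max\{\tau_1, \tau_2\}]$ using the following lemma, which is proved in Appendix III.

Lemma 1: Let $\{W_n\}$ and $\{Z_n\}$, $n \ge 1$, be i.i.d. discrete RVs with $(W_1, Z_1) \sim P_{W,Z}$, positive means $\mu_W = \mathbb{E}[W_1]$ and $\mu_Z = \mathbb{E}[Z_1]$, respectively, and finite moments of order $r \ge 3$, i.e., $\mathbb{E}[|W_1|^r] < \infty$ and $\mathbb{E}[|Z_1|^r] < \infty$. Define the random walks $U_n = \sum_{i=1}^{n} W_i$ and $V_n = \sum_{i=1}^{n} Z_i$, and the stopping times $\tau_1 = \inf\{n \ge 0 : U_n \ge \gamma\}$ and $\tau_2 = \inf\{n \ge 0 : V_n \ge \gamma\}$ for every $\gamma \in \mathbb{R}$. Then

$$\mathbb{E}[\max\{\tau_1, \tau_2\}] \le \frac{\gamma}{\min\{\mu_W, \mu_Z\}} + \frac{\sigma}{\sqrt{2\pi}} \sqrt{\frac{\gamma}{\mu_W}}\; 1\{\mu_W = \mu_Z\} + O\Big(\gamma^{\frac{r+1}{4r+2}} \log \gamma\Big) \tag{55}$$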


VI. Asymptotic Analysis: Achievability Bound 

Set P = P*, and fix r G N, g = jtEy’ and I' > 0, a 
parameter that will be related to the average blocklength. Let 
the thresholds be chosen as follows: 


as $\gamma \to \infty$, where $\sigma^2 = \mathrm{Var}\Big[\dfrac{W_1}{\mu_W} - \dfrac{Z_1}{\mu_Z}\Big]$.


Lemma 1 implies that there exists a constant $b_1$ such that

$$\mathbb{E}[\max\{\tau_1(\gamma), \tau_2(\gamma)\}] \le \frac{\gamma}{C} + g(\gamma) \tag{56}$$


1 ^ jk = C {I'- g{Cl')). (45) 

Here, 

gix) = logo; (46) 

where $b_1$ will be specified later. If we choose a code with a number of codewords $M$ that satisfies

$$\log M = C\big(\ell' - g(C\ell')\big) - \log \ell' \tag{47}$$

for sufficiently large $\gamma$. The conditional average blocklength of the VLSF code can be bounded as follows:

$$\mathbb{E}[\max\{\tau_1, \tau_2\}] = \mathbb{E}[\max\{\tau_1(\gamma), \tau_2(\gamma)\}] \tag{57}$$

$$\le \frac{\gamma}{C} + g(\gamma) \tag{58}$$

$$= \ell' - g(C\ell') + g\big(C\ell' - C g(C\ell')\big) \le \ell'. \tag{59}$$

Here, (58) holds by (56), and (59) follows by the definition of $\gamma$ in (45) and the fact that $g(x)$ is nonnegative and nondecreasing.


we have $(M - 1)\exp\{-\gamma\} \le 1/\ell'$. Furthermore, by Remark 3, the average probability of error is upper-bounded by


[1] 

7 + (1 - - l)exp{- 7 fc} 


[2] 

- I'-l I'-l 1' 

(48) 

[3] 

Suppose it can be shown that 


[4] 

E[max{ri, T 2 }] < 1' 

(49) 

[5] 

for sufficiently large V . Then the average blocklength is 


[6] 

(1 - g)E[max{ri,r 2 }] < — yl' = 1. 

(50) 

[7] 

[8] 

Consequently, by Theorem [1] there exists an {l,M,e)- 
code with 

■VLSF 

[9] 

logM > logM 

(51) 

[10] 

= C{1' -g{Cl'))-logl' 

(52) 

[11] 

Cl 1 Vi+Vo r , '■+1 


(53) 

[12] 

where the last step follows because 


[13] 


(54) 
To establish (49), we proceed as follows. Let 144 =
Appendix I

Steps Omitted in the Proof of the Converse Bound
A. Proof of Property 1

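Let $(f_n, g_{1,n}, g_{2,n}, \tau_1, \tau_2, U)$ be a tuple defining an $(\ell, M, \epsilon)$-VLSF code with $\Pr[\max\{\tau_1, \tau_2\} \le t] \le a$. Set

$$\tilde{\tau}_k = \begin{cases} t, & \tau_k < t \\ \tau_k, & \tau_k \ge t \end{cases} \tag{60}$$

and

$$\tilde{g}_{k,n}(u, y_k^n) = \begin{cases} g_{k,\tau_k}(u, y_k^{\tau_k}), & \tau_k < n \\ g_{k,n}(u, y_k^n), & \tau_k \ge n. \end{cases} \tag{61}$$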

larger than under ^ *1° larger than 1 — 1/M un¬ 

der ■ Hence, using the meta-converse theorem ||9l Th. 27] 
and the inequality |l£| Eq. (102)], we conclude that 

log M < log 7 ^“^ 


fk,u) 


log (. 


P 


{k,u) 

n,x 


*i“^(X;n)<log7l' 


(“) 


— e 




for all 7 ^“^ such that 


Here, 


P 


n,x 

n,x ■*' 


^fc(X;n)<log 7 /' 


(u) 


> el 




zr(x;y.)^iog -log 


P^i’x^yfelx) 




for all X e X°° and all yk € 3^^“^ Let now 


(logM) ^ and set 7 ^“^ = 7 ^“^ where 


7 ^“^ = supji^ e 


: P 


{k,u) 

Y.X 


tl^\x-,Yk) < \ogv 




P 


{k,u) 

y.x 


*i“)(X;n)<log7i“^-<5 




fk,u) 

y.x 


— ^k,M — 

Using (i64l l in (i62l l. we obtain 


*l“^(X;n)<log7i“^ 


log M < log 7 ^' 


0 


log ( 


P 


{k,u) 

y.x 


*i"^(X;n)<log7r 


(“) 




< log 7 ^“^ + log log M. 


Finally, by (i65l l and (i67l i. we have 


P 


{k,u) 


y,x 


(X; n) < log M - log log M - 5 

4“)(X;n)<log7i“)-5^ 


- y,x 

— ^k,M’ 


( 68 ) 

(69) 


(60) 


( 61 ) 


Using Gol Lem. 3] and the fact that is a convex 

combination of distributions, we obtain the following relation 
between (x; y) and 7^,“^ (x; y) 


Note that $\tilde{\tau}_k$ is also a stopping time with respect to the filtration $\mathcal{F}(U, Y_k^n)$ for $k \in \{1, 2\}$. Since $\tau_k$ is a function of $U$ and $Y_k^n$ given $\tau_k < n$, the new decoder $\tilde{g}_{k,n}$ is well-defined. Moreover, the decoders $g_{k,n}$ and $\tilde{g}_{k,n}$ yield the same probability of error. Thus $(f_n, \tilde{g}_{1,n}, \tilde{g}_{2,n}, \tilde{\tau}_1, \tilde{\tau}_2, U)$ defines an $(\ell', M, \epsilon)$-VLSF code, with $\ell' \ge \ell$.


< 7^“4x;y) - log 


1 


\Ptm' 


(70) 


The inequality in (69) can then be rewritten using (70) as follows:


B. Proof of ( 

For each decoder k, the average probability of error is no 


^fe.M — -^y.x 


— y,x 


7^“^ (X; Yfc) < log M - log log M - (5 


4“4X;n) < A* 


-log|lPt(A)|] (71) 
(72) 


(62) 


(63) 


Here, (72) follows by the definition of $\lambda_t$ in (14), and because the number of types $|\mathcal{P}_t(\mathcal{X})|$ is upper bounded by $(t + 1)^{|\mathcal{X}| - 1}$ [11, Lem. 1.1].
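The type-counting bound used in this step can be checked directly. The sketch below relies on the standard stars-and-bars count of empirical count vectors, which is a known combinatorial fact rather than something stated in the text; the function names and the tested ranges are choices made for this example.

```python
from math import comb

def number_of_types(t, alphabet_size):
    """Number of types of length-t sequences over an alphabet of the given
    size (stars-and-bars count of empirical count vectors)."""
    return comb(t + alphabet_size - 1, alphabet_size - 1)

def type_bound(t, alphabet_size):
    """Polynomial upper bound (t + 1)^(|X| - 1) on the number of types."""
    return (t + 1) ** (alphabet_size - 1)

# Check the bound on a few small cases.
for t in (0, 1, 5, 50):
    for size in (2, 3, 4):
        assert number_of_types(t, size) <= type_bound(t, size)

print(number_of_types(5, 3), type_bound(5, 3))
```

Because the number of types grows only polynomially in $t$, it contributes the $(|\mathcal{X}| - 1)\log(t + 1)$ term in $\lambda_t$ and does not affect the leading terms of the expansion.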

Appendix II 

Steps Omitted in the Asymptotic Analysis of the 
Converse Bound 

We will need the following property, whose proof follows 
from standard algebraic manipulations. 

Property 2: Fix arbitrary a; G M, a > 0, 6 > 0, and A > 0. 
Suppose that ^ > 0 is the unique solution to the equation 


A — 


= X. 


u) 1 

,MJ • 

(64) 


Then 


0 <^- 


Note that there exists an arbitrarily small positive constant $\delta$, which is independent of $\log M$, such that



(73) 


(74) 


For notational convenience, we will denote the mean, variance and third absolute moment of $i_k(x^t; Y_k^t)$ by

$$I_k(P_{x^t}) = I(P_{x^t}, W_k) \tag{75}$$

$$V_k(P_{x^t}) = V(P_{x^t}, W_k) \tag{76}$$

$$T_k(P_{x^t}) = T(P_{x^t}, W_k). \tag{77}$$


( 66 ) 

(67) 


According to (i74l l and since (3 satishes dTSl l. we have 


0</3-|^+Q I < <E- 


(75) 

(76) 

(77) 


(78) 

























This also implies that for all Px‘ G n^, 


A. Proof of (36) and (37)

For the case $t \in [0, \alpha)$, we use the following large-deviation bound


max Pr[jfe(x*; Yfc‘) > A] 

A:* 


< max Pr 

x'GA't 


^ -Ik (-Px* )> -- h (-Px‘ ) 


< max exp —c 
x‘Ga'‘ \ 

< exp(—clog^ A) 


A — tik (-Px*) 


< 


c log A 


(79) 

(80) 
(81) 
(82) 


where (80) follows from Hoeffding's inequality [12, Th. 2] and (81) follows because $t < \alpha$ and because $I_k(P_{x^t})$ is uniformly upper bounded by $C$. It follows from (82) that

[aj 






< (a + 1) ( - 


c log A 


<c(- 


= o(l). 

Using a similar argument, one establishes (37).


c log A— 1 


(83) 

(84) 


B. Proof of (39) and (40)

For the case when $t \in [\alpha, \beta)$, we need tighter bounds on $I_k(P_{x^t})$ and $V_k(P_{x^t})$. Let $\Pi_\mu$ be the set of probability distributions that are at distance no larger than $\mu$ from $P^*$:

$$\Pi_\mu = \{P \in \mathcal{P}(\mathcal{X}) : \|P - P^*\|_2 \le \mu\}. \tag{85}$$

Here, P — P* 2 — ~ Bounds on 

Ik{Pyit) and Vk{Pxt) are then supplied by IfTOl Lem. 7], which 
yields positive constants <j, /r and p for which 

4(Px*)<C-c||Px*- 7^112 

(86) 

14 (Px*) > Y 

(87) 

and 


|\/'14(Px*)- \A4| <p||Px* -PII 2 

(88) 


for all Px* G n^- 

Let $P_{x^t} \in \Pi_\mu$. The Berry-Esseen central limit theorem yields the following estimate


Pr[jfe(x*; Yfc‘) > A] 

^ f^( A - tik (Px*) ^ ^ atTk (Px*) 

- (i^fc(7^x*))3/2 

< o( ^ ~ (^x*) \ ^ 

" V ) Vi 


(89) 

(90) 


where the last inequality follows from dSTli and because 
Tk{Px*) < <E uniformly in 11^. 


max 

x*GA:’* 


{Pr[ii(x*;y/) > A]} max {Pr[i 2 (x*; > A] } 

■v^ 


< I I max Q 

x*Ga'* 


k=l 


x*Ga'* 

A — tik (P’x*) 

\/fVfc(Pxt) 




For the case when $P_{x^t} \notin \Pi_\mu$, we use Chebyshev's inequality to obtain the estimate

$$\Pr[i_k(x^t; Y_k^t) \ge \lambda] \le \frac{t V_k(P_{x^t})}{(\lambda - t I_k(P_{x^t}))^2} \tag{92}$$

for all $\lambda > t I_k(P_{x^t})$. Since $P_{x^t} \notin \Pi_\mu$, there exists a constant $C'$ such that $I_k(P_{x^t}) \le C' < C$. Hence, for sufficiently large $\lambda$, the condition $t \le \beta$ implies that $\lambda > t I_k(P_{x^t})$. Therefore, by (92), we have that


^max Pr[zfc(x*;yfc*) > A] 

/ tVkiP^^) 

< max --^---TT 

Px*^n^ {\-tIk{P^t)Y 

<ct 

- (A - tC'Y 

^ cA 

“ (A-AC"/C'-c\/A-c)2 
c 

< - 

- A 


(93) 

(94) 

(95) 

(96) 


where we have used that t < (3 <2t for sufficiently large A and 
that Vk{Pxt) is uniformly upper-bounded ifTOl pp. 7048]. We 
see that maxp^j^n^ Pr[zfe(x*; Y]f) > A] can be driven arbitrarily 
close to zero by having A sufficiently large. This implies that we 
only need to consider the input vectors x* for which Px* G 11^, 
i.e.. 


max^Pr[*fc(x*;Lfc‘) > A] 


- > A] + 


Using (l90l l and dOTT i. we obtain 


max^Pr[*fc(x‘;Lfe‘) > A] 

A — tik (Px*) 


< max Q 

P„tGn„ 


ViV^ 


. A-fPfe(Pxt) 

< Q min 


c c 

A 

c c 


P.*6n, v/ii4(Px*) y 

A — tik (P’x*) 


(97) 


4 >{x)i 


^.*en^ VmiPV) 


<z\dz + 


(98) 

(99) 

c 

Vt 

( 100 ) 
























for all sufficiently large $\lambda$. The indicator function in (100) can be upper bounded as


Appendix III 
Proof of Lemma 1


A — fJfc(Px*) 


-z'> <0 


= 1 "I max {f4(PxO + z^JtVk{P^.) - a} > o| (101) 

< 1 jiC - 4^^ + z\/Wk + \z\y/ip^ - A > o| 

< 1 jfC + zs/Wk + ^ ^ ^ o} 


( 102 ) 

(103) 


< 1 


< 1 


■ A - - fC 




< z 


C 


\Vk 


Ap 


C3 2C'<t 


< t 


(104) 

(105) 


where (llOll l follows since y^tVki'x*) > 0 for P^t G 11^ by dSTl i. 
(fT02l l follows by dMll and ^ with ^ = ||Px‘ - (flM l 

follows because —+ Izlp^'/i is a quadratic expression in 
with maximum and (11051 1 follows from (l74l l. The steps 
(1101 l) - (l 103b essentially follow from ifTOl Prop. 8 ]. Substituting 
(fTOSl) into (1 100b and summing from ([aj + 1 ) to , we obtain 


L/31 

max Pr[zfc(x‘;Yfe*) > A] 
L/31 .oo 


si:/ 


+ 0(logA) 

POO 


A VkX \z\p 


2C^ 


(106) 


< 


< 


' 0 ^ —oo 

fO(logA) 

POO 

/ m I 

’ —oo j 

VO{\og\) 


c 


A>{z)t ^ ^ ^ ^ ^ 4 df 


(107) 


A _ Yh^ _ 

C C3 2Cc; 


< t} dt dz 


(108) 


< /3-E 


min-^ PA—-Zk 


VkX 

(J3 


O(logA) 


<V^(Q-Ae)-E 


(109) 

lin {Q-\e),gkZk}])+O{l 0 gX) 

( 110 ) 


where $\varrho_k$ are defined in Theorem 3 and $Z_k \sim \mathcal{N}(0, 1)$. Here, (107) follows because the indicator function is nondecreasing in $t$, in (108) the order of the integrals is interchangeable by Tonelli's theorem, and in (109) we have used (74).

By following the same approach, we obtain (40).


Fix $\gamma \in \mathbb{R}$. We define the following two random walks, which are equivalent to $U_n$ and $V_n$, but more convenient to analyze:

$$A_n = U_n/\mu_W + V_n/\mu_Z \tag{111}$$

$$B_n = U_n/\mu_W - V_n/\mu_Z. \tag{112}$$

We also define the additional stopping time

$$\tau_{12} = \inf\Big\{ n \ge 0 : A_n \ge \gamma\, \frac{\mu_W + \mu_Z}{\mu_W \mu_Z} \Big\}. \tag{113}$$

We shall next show that


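$$\mathbb{E}[\max\{\tau_1, \tau_2\}] \le \mathbb{E}\big[\tau_{12} + \tau_1'(\gamma - U_{\tau_{12}}) + \tau_2'(\gamma - V_{\tau_{12}})\big] \tag{114}$$

where $\tau_1'(\cdot)$ and $\tau_2'(\cdot)$ are defined as

$$\tau_1'(\gamma) = \inf\Big\{ n \ge 0 : \sum_{i=1}^{n} W_i' \ge \gamma \Big\} \tag{115}$$

$$\tau_2'(\gamma) = \inf\Big\{ n \ge 0 : \sum_{i=1}^{n} Z_i' \ge \gamma \Big\} \tag{116}$$

and where $\{W_k', Z_k'\}$ are i.i.d. and $(W_1', Z_1') \sim P_{W,Z}$ but independent of $W_j$, $Z_j$ for all $j \in \mathbb{N}$. Note that $\tau_1'$ and $\tau_2'$ are independent of $U_{\tau_{12}}$ and $V_{\tau_{12}}$.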

To prove (114), we use the following argument. At time $\tau_{12}$, we have that $U_{\tau_{12}}/\mu_W + V_{\tau_{12}}/\mu_Z \ge \gamma\,\frac{\mu_W + \mu_Z}{\mu_W \mu_Z}$, which implies that either $\tau_1 \le \tau_{12}$ or $\tau_2 \le \tau_{12}$ (or both) are satisfied. Consider the case $\tau_1 \le \tau_{12}$ and $\tau_2 > \tau_{12}$. To bound $\mathbb{E}[\max\{\tau_1, \tau_2\}]$, we need to characterize the remaining time until the random walk $V_n$ hits the threshold $\gamma$. This time is given by $\min\{n \ge 0 : V_{\tau_{12} + n} \ge \gamma\}$, which has the same distribution as (116) computed at $\gamma - V_{\tau_{12}}$. Note also that $\tau_k'(\gamma) = 0$ for every $\gamma \le 0$ since we use the convention $\sum_{i=1}^{0}(\cdot) = 0$. The inequality in (114) follows because there exist events for which $\max\{\tau_1, \tau_2\} < \tau_{12}$. The case $\tau_2 \le \tau_{12}$ and $\tau_1 > \tau_{12}$ can be analyzed similarly.

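By [4, Th. 3.9.4] (or by Wald's equality when $W_1$ and $Z_1$ have bounded support [3, Eq. (106)-(107)]), we have

$$\frac{\gamma}{\mu_W} \le \mathbb{E}[\tau_1'(\gamma)] \le \frac{\gamma}{\mu_W} + c \tag{117}$$

$$\frac{\gamma}{\mu_Z} \le \mathbb{E}[\tau_2'(\gamma)] \le \frac{\gamma}{\mu_Z} + c \tag{118}$$

$$\gamma\, \frac{\mu_W + \mu_Z}{2 \mu_W \mu_Z} \le \mathbb{E}[\tau_{12}] \le \gamma\, \frac{\mu_W + \mu_Z}{2 \mu_W \mu_Z} + c. \tag{119}$$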


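Using (114), the linearity of expectation, (117)-(119), and the fact that

$$\mathbb{E}[\tau_1'(\gamma - U_{\tau_{12}})] = \mathbb{E}\big[\mathbb{E}[\tau_1'(\gamma - U_{\tau_{12}}) \mid U_{\tau_{12}}]\big] \le \frac{1}{\mu_W}\, \mathbb{E}\big[(\gamma - U_{\tau_{12}})^+\big] + c \tag{120}$$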
































we conclude that 


E[max{ri,T 2 }] - 7 


^^w + ^^z 


< -E 

fJ-W 

= —E 
fJ-W 

+ —E 
fJ-z 


h-Ur,,y 


1 

^J'Z 


■E 


(7-K.J 

+' 


+ 


7 2 


7 (^T12 ^^ 12 ) 


< E 


f 1 1 

2 


Mty + 
^^w^J■z 


7 - 


-E 




jy 1 / ^J‘W + Hz 

^z 2 \ ^J-w^^z 




-Br 


fJ-z — ^J-w 
i - 

fJ-WfJ-Z 


-B. 


+ C 


Pr 


Nn 

— - 1 
nv 


^ Cn 


= o 




for some constant v and a sequence {Cn} that vanishes as n 
00 and that satishes - < Cn for all n. Then 


sup 

agk 


Pr 


'N„ 

< ay/mX 

.i=l 


-<i>(A) 


The RV i 3 ri 2 its variance satisfies l|4] Th. 4.2.4 (ii’)] 




as 7 


A Wrj, Zr^ 

A^W 

Note that by (II 19b . we have 


E[iV„] = E[ri 2 ( 7 n)] 
^^w + ^^z 

= vn + Oil), 


0 ( 1 ) 


We next show that condition (1125b in Lemma |2] is satished. 
Indeed, 


Pr 


+ c ( 121 ) 


\ Nn 

' 

— - 1 

IV 

vn 

. 


< 


( 122 ) 


Pr 

Nn — vn 

y/vn 


E 

Nn — i'n 

r- 



( 

y/^Cn) 

r 




(V^CnY 

c 


= o(v^) 


(131) 

(132) 

(133) 

(134) 


+ (E (123) 
(124) 


n’’/^ (n S^’+l ^ 72 4 r +2 

as $n \to \infty$. Here, (132) follows from Markov's inequality and (133) follows from [4, Th. 3.8.4(i)].

Let F{X) = Pt[Bn^ < ay/vnX]. We can now use Lemma|2] 
which for sufficiently large n implies that 


where (1123b follows from the dehnition of T 12 (see (II 13b ) which 
implies that At-,, > 

We next show that the RHS of (124) is upper-bounded by the RHS of (55) by the following two steps. First, we shall approximate $B_{\tau_{12}}$ by a Gaussian RV using a variation of the Berry-Esseen theorem that holds when the number of terms in the summation is a RV (see Lemma 2 below). Then, we shall establish (55) using standard properties of Gaussian RVs.

Lemmal: ([|5] Th. 1]) Let n > Ijbei.i.d. RVs with zero 
mean, positive variance u^, and hnite third absolute moment. 
Let {Nm n G N} be a sequence of positive integer-valued RVs 
and assume that 


sup |F(A) — $(A)| < cn *'■+=. 
AgK 


(135) 


We next refine our estimate in (135) using Lemma 3 below.

Lemma 3: ([13, Th. 9]) Let $F(x)$ be the cumulative distribution function of a RV that has finite moment of order $p$. Suppose that $0 < \Delta = \sup_x |F(x) - \Phi(x)| \le 1/\sqrt{e}$. Then there exists a constant $C_p$, that depends only on $p$, such that


\F{x) — 4)(a:)| < 

for all X. Here 


CpA{logiY^^ + Pp 


Pp — 


|a;|^dF(a;) — 


l + \x\P 


|a;|^d$(a;) 


(125) 


Using Lemma [ 3 ] and (1135b . we have that 

|i.(A) - 4.(A)| < ■^"~7‘7 a"+ 
for A G R and sufficiently large n. Here, 


. (126) 


P2{n) = 


Var[5 


N„ 


- 1 


n + 0{l) 


- 1 


< 


(136) 


(137) 


(138) 


(139) 


Fix an arbitrary a G R. Using (1138b . we obtain the following 
upper bound 


E[|a - BnJ] 


(127) 


00 . For some constant n > 0, let 7 „ A ‘^’^p-wpzn 

pw+p-z 

Nn = ti 2 ( 7 „), ^ and Cn = n~^ for n G N. 


= ay/izn 


< CTV izn 


1 + F 


$ 


-x] -F 


dx 

(140) 


-X -f l-$ 


(128) 

(129) 

(130) 


cn '‘'■+2 logn -(- c/n ^ cn '‘'■+2 logn -(- c/n 


l + (-^-fx)2 

X<7^/yn. > 

= ay/iznK —:= — Z 


l + (-7- xY 

A ( 7 ^/vn. ' 


dx 

(141) 


<7y/Vn 





















































































+ TTCr 


v^( cn 2 <‘'■+2 logn + c/i/^ 

a 


y/Unip 


+ |a| + 0(n*''+2 logn) (143) 


(142) 

< 4'( 

7 


(154) 

^min{/rvv,Atz} 

(143) 

< T-f 

" 7 ' 

1 + c 

(155) 


as n —>■ c», where Z ^ A/'(0,1) and 


+ 


O' hifj-w + fj-z) 


“ 1 ^ 1 ) 


= exp 


X 

'T 




(144) 
- sgn(x) ) . (145) 


20r y fJ-wfJ-z 

+ 0 ( 74 ^ log 7 ) 


1 {f^W = ^J-z} 


(156) 


■ r ^ -7 + = ^^Z} 

mm{fj,w,tJ‘z} v27rVMiv 


_|_ 0 (-y 4 r +2 log 7 ). 


(157) 


The positive function ipi^) 1® unimodal with maximum 1 at¬ 
tained at X = 0 and decays exponentially to 0 as |x| —^ 00 . 

Substituting a = 7w (I1431 I. we Here, (11511) follows because E[max{ri( 7 ), T 2 ( 7 )}] is nonde- 
obtain creasing in 7 . 


E 


fJ-z - IJ-w „ 

In - Bn„ 

^J'W^^Z 


/ /2 ^— f - fj-w)\ 

- \ - 7 -^^ 

V tt V a■(^lw + fj-z) J 


In 


fJ-z — ^J-W 


+ 0(n*’-+^ logn). 


(146) 


fJ'Wt^Z 

Note that for the case fiz ^ IJ-w, we have that 


2^/Z^{fJ.z-IJ.w)\ _ /IN 

a(t.w+t^z) J - ^ 

([14^ into (fT24l l. we obtain 


00 . Substituting 


E [max{ri ( 7 „), T 2 ( 7 „)}] 


< 7i 


fJ-w + f^z 
2f^wMz 
/J-z 




In 




^J-W^J'Z 


- B. 


Tl2(7n) 


0 ( 1 ) 


= 7n 


^J■w + 
^fj-wfj-z 


fj-z — fj-w 


2fJ,wf^z 


7 


y/iml {fiw = fJ-z} + 0 (n -‘'-+2 logn) 


+ 


uim{fiw, IJ-z} 


{pvv = ^J-z} 


(147) 


(148) 


r+l 

+ 0(714^+2 logn), n —>• 00 (149) 

where (149) follows from the identity $a + b + |a - b| = 2\max\{a, b\}$.

To complete the proof, let m = “ 

X + {^iw = f^z} + bix*^ logx, and set n = 

fJ-W+M-Z J g 

2 max{/ 2 vr,Mz} ’ 


7 „ = min{pH 7 , Mz} n. (150) 


Note that T' (x) is nondecreasing, concave and differentiable in 
X S [1, 00 ]. Then there exists a constant 61 > 0 such that 


E[max{ri(7),T2(7)}] 

< E[max{Ti(7„J,T2(7„J}] 

a _ ^+1 

< ni + —=y/im^l {fiw = fJ-z} + binl"^'^ logni 

V 27r 

=vi,fr _ 2 _ 1 ) 

V min(piy,^2) J 


(151) 

(152) 

(153) 
















































